The evolution of the Internet of Things (IoT) as a promising modern
technology has facilitated daily life. Its emergence has been accompanied by
challenges: it is frequently exposed to attacks and is a target for
intruders who exploit gaps in the technology that arise from the
heterogeneous nature and large volume of its data. This has made the study
of cyber security an urgent necessity for monitoring infrastructures.
Network flaw detection and intrusion detection help protect the network
by detecting attacks early and preventing them. Motivated by advances in
machine learning, especially deep learning and its ability to learn on its
own and extract features with high accuracy, this research uses deep
learning to analyze the real CSE-CIC-IDS2018 network traffic data set,
which includes normal behavior and attacks, and evaluates our deep long
short-term memory (LSTM) model, which achieves a detection accuracy of
up to 99%.
1. INTRODUCTION 

The emergence of the Internet of Things (IoT) and the rapid growth in its applications and uses have
brought about a technological revolution, along with difficulties and challenges concerning
heterogeneity, privacy, and control of data security [1]-[3]. These challenges have made network
monitoring and intrusion detection a necessity. A network intrusion detection system (NIDS) plays a key
part in detecting attacks and addressing security challenges: it monitors and identifies suspicious
activities, classifies traffic as normal or malicious, and detects security violations, intrusions, and
anomalies [4]-[6]. A NIDS aims to provide an infrastructure capable of detecting vulnerabilities and
warning about them in a smart, secure, and reliable manner, unlike a firewall, which protects only by
allowing authenticated networks to pass through [7].

Attacks in networks are detected in three ways. "Signature-based detection" matches the signatures
of known attacks against the current traffic. "Anomaly-based detection" builds a normal, or legitimate,
profile obtained under normal network conditions without attacks and compares the network's actions
with it to identify anomalies. "Specification-based detection" matches a predetermined, stored
specification against the criteria that describe a certain program's operation and reports any
violation of those criteria [8]-[10].
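As a rough illustration of the first two approaches, the sketch below contrasts a signature lookup with a simple anomaly score. The signature strings, the single feature (packets per second), and the threshold of three standard deviations are hypothetical choices made for this example and are not taken from any of the cited systems.

```python
from statistics import mean, stdev

# Hypothetical signatures of known attacks (illustrative only).
KNOWN_SIGNATURES = {"GET /admin.php?cmd=", "' OR '1'='1"}


def signature_match(payload: str) -> bool:
    """Signature-based detection: flag traffic that contains a known attack pattern."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def build_profile(normal_rates):
    """Anomaly-based detection, step 1: summarize attack-free traffic as (mean, std)."""
    return mean(normal_rates), stdev(normal_rates)


def is_anomalous(rate, profile, threshold=3.0):
    """Anomaly-based detection, step 2: flag traffic far from the normal profile."""
    mu, sigma = profile
    return abs(rate - mu) > threshold * sigma


# Made-up packets-per-second readings recorded without attacks.
profile = build_profile([98, 102, 101, 99, 100, 103, 97])
```

A signature lookup can only recognize attacks it already knows, whereas the profile comparison can flag unseen behavior at the cost of possible false alarms.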

Firewalls and authentication methods can protect systems and prevent unauthorized access, but they
cannot monitor network traffic the way a NIDS does, whereas an intrusion detection system monitors the
flows entering and leaving the network [11], [12]. Most previous work on NIDS used old, simulation-based
data sets such as KDDcup99 or NSL-KDD for experiments, which neither represent real data nor reflect
real network traffic scenarios [13].

The choice of data set used to extract information is of great importance because it supports the
work of the detection model. It is equally important to design a model that adopts an effective
algorithm for extracting features. We therefore use deep learning algorithms, which have an advantage
over traditional machine learning methods because they extract features automatically rather than
manually, giving high accuracy and detection speed, especially in the field of big data [14], [15].
Real data sets that reflect network traffic and the associated intrusions allow a more accurate
evaluation of models than old data sets. In this paper, we use the real CSE-CIC-IDS2018 data set,
collected on the Amazon Web Services (AWS) platform, which represents dynamic, modifiable, repeatable,
and scalable data. We design a deep learning model using long short-term memory (LSTM), one of the best
deep learning models for prediction, and obtain a high detection accuracy of up to 99%.


2. RELATED WORKS 

Due to the importance of network traffic analysis and verification, many recently published studies
have addressed this topic. This section presents the most important works that use deep learning for
network intrusion detection (NIDS). Algorithms differ in how well they detect different attacks: they
work well with some attacks and poorly with others. Ferrag et al. [16] compared different deep learning
(DL) techniques, including "deep neural network", "recurrent neural network", "restricted Boltzmann
machine", "deep belief networks", "convolutional neural networks", "deep Boltzmann machine", and "deep
autoencoder". The techniques were applied to two real data sets (CIC-IDS 2018 and BoT-IoT) covering the
latest attacks. The recurrent neural network (RNN) scored the highest detection rate for seven of the
attacks, including "Brute force cross-site scripting (XSS)", "Brute force-web", "denial-of-service (DoS)
attack Hulk", "DoS slow hypertext transfer protocol (HTTP) test", "DoS attack slowloris", and "DoS
attack Goldeneye", while the convolutional neural network recorded the highest detection rates for the
remaining four types of attacks, including "DDoS HOIC attack", "DDoS LOIC-UDP attack", and "Botnet".
However, the study used only a small percentage of the large data volume.

Karatas et al. [17] analyzed six machine-learning-based intrusion detection systems: "Adaboost",
"Decision tree", "Gradient boosting", "Random forest", "K nearest neighbor", and "Linear discriminant
analysis". They used a modern data set instead of old data and applied the synthetic minority
oversampling technique (SMOTE) to reduce the imbalance rate in the data, which increases the detection
rate.

Lin et al. [13] proposed a system that detects anomalies using long short-term memory (LSTM) and an
attention mechanism (AM) to increase network training performance. The CIC-IDS 2018 data set was used
to train the proposed model, and the reported results were an accuracy of 96.22%, a detection rate of
15%, and a recall rate of 96%. Kanimozhi and Jacob [15] proposed a system that classifies bot attacks in
banking transactions and applied it to the CIC-IDS 2018 data set. Their approach combines several
methods with each other and with artificial intelligence, and uses reliability charts to verify the
predicted probabilities of the items.

Zhou and Pezaros [18] applied six deep learning methods to the CIC-AWS-2018 data set to detect
attacks and classify zero-day attacks; this data contains eight types of attacks and fourteen types of
breaches. They recorded an intrusion detection rate of 100%, a zero-day intrusion accuracy of 96%, and
a 5% false-positive rate. Basnet et al. [11] examined the ability of deep learning algorithms to detect
breaches by comparing a set of deep learning frameworks, namely PyTorch, Keras, TensorFlow, Theano, and
fast.ai, for detecting and categorizing attacks and breaches. Kim et al. [19] suggested a convolutional
neural network (CNN) model that converts the data into images and applied it to the CIC-2018 data set.
Each image has a size of 13x6, because the set contains 78 features.


3. METHOD 

This part introduces the design of a network intrusion detection model that uses a deep learning
technique to overcome the high dimensionality of network traffic content. To increase the
effectiveness of detection, we use a system based on anomaly detection and classification of various
attacks. The system was implemented on a real data set that includes all of the attacks,
CSE-CIC-IDS2018. The data is prepared by applying standard procedures: pre-processing the data,
selecting features, training the model, and evaluating model performance. Pre-processing involves
collecting and arranging the data and deleting all unnecessary features. We then normalize all data to
the range [-1,1] and implement the intrusion detection system using the LSTM deep learning model, as
shown in Figure 1.
Figure 1. The preprocessing for intrusion detection 
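The normalization step can be sketched as follows. The text states only the target range [-1,1]; min-max scaling per feature is assumed here as one common way to reach that range, and the function name and sample values are hypothetical.

```python
def scale_to_range(column, low=-1.0, high=1.0):
    """Min-max scale one feature column into [low, high] (assumed scheme)."""
    c_min, c_max = min(column), max(column)
    if c_max == c_min:
        # Constant feature: map every value to the middle of the range.
        return [(low + high) / 2.0 for _ in column]
    span = c_max - c_min
    return [low + (high - low) * (x - c_min) / span for x in column]


# Made-up flow durations used only to show the effect of the scaling.
flow_durations = [10.0, 20.0, 30.0, 50.0]
scaled = scale_to_range(flow_durations)
```

In practice the minimum and maximum would usually be computed on the training set only and then reused for the test set, so that no test information leaks into training.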


3.1. Real dataset (CSE-CIC-IDS2018) 

This part describes the data used to implement intrusion detection. It is real, dynamic data
collected on the Amazon Web Services (AWS) platform by the Communications Security Establishment (CSE)
and the Canadian Institute for Cybersecurity (CIC), and it represents real-time network traffic [20].
It is considered one of the most reliable data sets for evaluating intrusion detection models based on
network anomalies [21]. The data contains the latest attacks and includes ten classes, shown as columns
in Figure 2 and ordered by their share of the data: Benign, Bot, FTP-BruteForce, SSH-Bruteforce, DDOS
attack-HOIC, DDOS attack-LOIC-UDP, DoS attacks-GoldenEye, DoS attacks-Slow HTTP test, intrusion, and
web attacks [22]. Table 1 shows the number of data points in each class and its percentage of the
original data volume. The attack infrastructure includes 50 devices, and the victim organization
contains 30 servers, 420 terminals, and 5 departments [23]. The data contains 80 features extracted
using the CICFlowMeter-V3 tool [24]. Table 2 shows a sample of the features extracted from the traffic.


[Figure 2 plot: "Distribution of data"; y-axis: data points per class (scale 1e6); x-axis: class]


Figure 2. Distribution of the attack classes in CIC-IDS 2018 dataset 
Table 1. Volume of data points in each class and its ratio of the original data

No.  Class name                  Data points in class   Ratio of the original data (1252835 rows)
1    Benign                      971016                 77.505 %
2    Infilteration               38703                  3.089 %
3    DoS attacks-Hulk            37323                  2.979 %
4    Bot                         137185                 10.95 %
5    DDOS attack-HOIC            57507                  4.59 %
6    DDOS attack-LOIC-UDP        8377                   0.669 %
7    FTP-BruteForce              2234                   0.178 %
8    DoS attacks-GoldenEye       332                    0.026 %
9    DoS attacks-SlowHTTPTest    103                    0.008 %
10   SSH-Bruteforce              55                     0.004 %
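The ratios in the table follow directly from the class counts. The short sketch below recomputes them and also derives the ratio between the largest and smallest class as a simple indicator of imbalance; the variable names are chosen for this example.

```python
# Class counts as listed in the table above.
class_counts = {
    "Benign": 971016,
    "Infilteration": 38703,
    "DoS attacks-Hulk": 37323,
    "Bot": 137185,
    "DDOS attack-HOIC": 57507,
    "DDOS attack-LOIC-UDP": 8377,
    "FTP-BruteForce": 2234,
    "DoS attacks-GoldenEye": 332,
    "DoS attacks-SlowHTTPTest": 103,
    "SSH-Bruteforce": 55,
}

total = sum(class_counts.values())

# Share of each class in percent of all rows.
ratios = {name: 100.0 * count / total for name, count in class_counts.items()}

# Largest class divided by smallest class: a rough measure of imbalance.
imbalance = max(class_counts.values()) / min(class_counts.values())
```

The counts add up to the 1252835 rows stated in the table header, and the largest class (Benign) is more than 17,000 times the size of the smallest (SSH-Bruteforce).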


Table 2. Sample from CIC-IDS 2018 dataset features

Feature name     Description of feature
down_up_ratio    Download and upload ratio
Fl/dur           Flow duration
fw/pkt/avg       Average size of packet in forward direction
fw/act/pkt       Number of packets with at least 1 byte of transmission control protocol (TCP) data payload in forward direction
fw/pkt/std       Standard deviation of packet size in forward direction
tot/bw/pk        Total packets in the backward direction
tot/fw/pk        Total packets in the forward direction
Pkt/len/var      Minimum inter-arrival time of packet
bw/pkt/max       Maximum size of packet in backward direction
bw/pkt/min       Minimum size of packet in backward direction
fw/win/byt       Number of bytes sent in initial window in forward direction
bw/win/byt       Number of bytes sent in initial window in backward direction
bw/hdr/len       Total bytes used for headers in backward direction
Fw/hdr/len       Total bytes used for headers in forward direction


3.2. Pre-processing on dataset 

The original data set contains 80 features. Some of them, such as the timestamp and the internet
protocol (IP) addresses, have little effect on determining whether traffic behaves normally and do
not help in training the network to detect errors and intrusions. These features are deleted,
leaving 78 of the original features. We then divide the data set into a training set containing 70%
of the original data and a test set containing the remaining 30%.
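A minimal sketch of these two steps is given below, using plain Python lists so that it stays self-contained. The column names and toy values are hypothetical, and shuffling before the split is an assumption: the text does not state whether the rows were shuffled or split in their original order.

```python
import random


def drop_columns(header, rows, to_drop):
    """Remove the named columns from the header and from every row."""
    keep = [i for i, name in enumerate(header) if name not in to_drop]
    return [header[i] for i in keep], [[row[i] for i in keep] for row in rows]


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows (assumption) and split it 70/30."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Toy data: column names and values are made up for illustration.
header = ["Timestamp", "Src IP", "Flow Duration", "Tot Fwd Pkts", "Label"]
rows = [["t%d" % k, "10.0.0.%d" % k, 100 + k, k, "Benign"] for k in range(10)]

header, rows = drop_columns(header, rows, {"Timestamp", "Src IP"})
train_rows, test_rows = split_train_test(rows)
```

With strongly imbalanced classes, a stratified split that preserves class proportions in both parts may be preferable to a purely random one.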


3.3. Long short-term memory (LSTM) 

Deep learning is a powerful method for accurate detection and prediction on large and complex data
such as videos, images, and text [25]. We therefore use the capabilities of deep learning, which relies
on multiple processing layers that learn representations of the data in multiple hidden layers [26].
These layers help increase accuracy and reduce cost in the detection of attacks and malware [27]. A
survey of intrusion detection techniques used in cyber security [28], which described the DL types and
how they work, showed their superiority over traditional machine learning (ML) methods, because they
extract features automatically instead of relying on the manual feature engineering used in ML.

LSTM is one of the most important types of deep learning used with sequential data [29], because it
takes both the current and the previous network traffic into account. This matters because attackers
carry out an attack as a series of continuous steps. LSTM also helps resolve the long-term dependency
problem [30] and is considered a development of the RNN, adding a forget gate, an input gate, and an
output gate to the RNN model. Here, the LSTM model was used with network traffic data because such data
is generally considered sequential and LSTM handles sequential data well in practice.
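To make the role of the three gates concrete, the sketch below implements a single time step of a standard LSTM cell in plain Python. It follows the textbook formulation rather than the internals of any particular library, and the parameter layout (one weight matrix for the input, one for the previous hidden state, and a bias per gate) and all names are assumptions made for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, x, h):
    """Compute W.x + U.h + b for one gate, one output unit per row."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_k
        for W_row, U_row, b_k in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step: returns the new hidden state and cell state."""
    f = [sigmoid(z) for z in affine(*params["forget"], x, h_prev)]       # what to keep from c_prev
    i = [sigmoid(z) for z in affine(*params["input"], x, h_prev)]        # how much new content to write
    o = [sigmoid(z) for z in affine(*params["output"], x, h_prev)]       # how much of the cell to expose
    g = [math.tanh(z) for z in affine(*params["candidate"], x, h_prev)]  # candidate cell content
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


# Toy parameters: 2 input features, 1 hidden unit, all weights zero.
zero_gate = ([[0.0, 0.0]], [[0.0]], [0.0])  # (W, U, b)
params = {name: zero_gate for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step([0.3, -0.7], [0.0], [2.0], params)
```

The cell state `c` carries information across time steps, which is how earlier traffic can influence the classification of the current flow.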


3.4. Experimental environment 

Our LSTM model was implemented in Python 3.9 using the Visual Code 2019 program. It uses
TensorFlow [31] and Keras [32], together with the Pandas, Scikit-learn, and NumPy libraries, in a
Windows 10 environment. The hardware comprises a Core i7 CPU, 4 GB of memory, and a 512 GB hard disk.
4. RESULTS AND DISCUSSION 

This part presents the experimental results of the LSTM model, evaluates them with the main
measures, and then compares the LSTM model with other models.


4.1. Results 

This section reviews the hyperparameters that must be set to avoid overfitting, as shown in
Table 3. The LSTM model contains three layers. The first layer includes 78 neurons and the second
contains 64 neurons; both use the ReLU activation function. The third layer contains 8 neurons and uses
the Softmax activation function. Both are non-linear activation functions, which allow the model to
capture relationships that linear activation functions cannot. The loss function, which measures the
difference between the actual and expected output values, is calculated, and to reduce it we use the
Adam optimizer, which uses the gradients of the loss to update the weights and improve the model
results.
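The relationship between the Softmax output and the categorical cross-entropy loss can be illustrated with a short sketch. It uses the standard definitions of both functions; the three-class scores are made-up values, and the function names are chosen for this example rather than taken from the model code.

```python
import math


def softmax(scores):
    """Turn raw output scores into class probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: negative log-probability of the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


# Made-up scores for a three-class example.
probs = softmax([2.0, 1.0, 0.1])
loss_correct = categorical_crossentropy([1, 0, 0], probs)  # true class has the top score
loss_wrong = categorical_crossentropy([0, 0, 1], probs)    # true class has the lowest score
```

The loss is small when the model assigns high probability to the true class and grows as that probability shrinks, which is the signal the optimizer uses to adjust the weights.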


Table 3. Hyperparameters of the proposed LSTM model

Parameter name          Value
Hidden nodes in LSTM    150
Batch size              200
Epochs                  30
Length of flow          10
Learning rate           0.001
Loss function           categorical_crossentropy
Activation function     ReLU, Softmax
Optimizer               Adam
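One plausible reading of the "length of flow" parameter is the number of consecutive flow records fed to the LSTM as one sequence. Under that assumption, the sketch below groups records into overlapping windows of 10 steps; labelling each window by its last record is also an assumption, and the toy data is made up.

```python
def make_windows(flows, labels, length=10):
    """Group consecutive flow records into overlapping sequences of `length` steps."""
    X, y = [], []
    for end in range(length, len(flows) + 1):
        X.append(flows[end - length:end])  # `length` consecutive feature vectors
        y.append(labels[end - 1])          # label of the last flow in the window (assumption)
    return X, y


# Toy data: 12 flows with 3 features each (the real data has 78 features).
flows = [[float(t), float(t) * 2, 1.0] for t in range(12)]
labels = ["Benign"] * 11 + ["Bot"]

X, y = make_windows(flows, labels)
```

Each element of `X` then has the shape (time steps, features) that a recurrent layer expects for a single sample.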


4.2. Used metrics 
To evaluate the performance of the model, this paper uses three measures: accuracy, loss, and the
confusion matrix. Accuracy represents the ability to classify samples correctly, as given in (1).

$$\text{Accuracy} = \frac{\text{true positive} + \text{true negative}}{\text{true positive} + \text{false positive} + \text{true negative} + \text{false negative}} \qquad (1)$$

Accuracy is the proportion of samples that are correctly classified. It tends to vary inversely with
the false alarm rate (FAR): the higher the accuracy, the lower the false alarm rate. Figure 3 shows the
accuracy in the training and testing phases. The loss function measures the difference between the
expected and actual output; Figure 4 shows the loss in the training and testing phases. The confusion
matrix is a tabular representation that summarizes the performance and accuracy of the classification
process, showing the correctly and incorrectly classified samples for each class; it gives an idea of
the errors the model makes and of how to correct them. Table 4 shows the confusion matrix for
predicting normal and attack packets in network traffic.
Figure 3. The accuracy of train stage and test stage 
Figure 4. Loss of train stage and test stage 
Table 4. Confusion matrix

                     Benign  Infilteration  DOS-Hulk  Bot    DDOS-HOIC  DDOS-LOIC  FTP-Bruteforce  DoS-GoldenEye  DoS-Slow HTTP Test  SSH-Bruteforce
Benign               339502  1              1         4      124        190        22              12             0                   0
Infilteration        6       11             0         0      2          0          0               0              0                   0
DOS-Hulk             0       0              13546     0      0          0          0               0              0                   0
Bot                  7       0              4         13052  0          0          0               0              0                   0
DDOS-HOIC            23      0              0         0      47992      0          0               0              0                   0
DDOS-LOIC            96      0              0         0      0          20032      0               0              0                   0
FTP-Bruteforce       13      0              0         0      0          0          2919            0              0                   0
DoS-GoldenEye        22      0              0         0      0          0          2               758            0                   0
DoS-Slow HTTP Test   0       0              0         0      0          0          0               0              116                 0
SSH-Bruteforce       24      9              0         0      3          0          0               0              0                   0
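Overall accuracy and per-class detection rates can be derived from the confusion matrix. The sketch below assumes that rows are the true classes and columns the predicted classes, which the table does not state explicitly; the variable names are chosen for this example.

```python
labels = [
    "Benign", "Infilteration", "DOS-Hulk", "Bot", "DDOS-HOIC",
    "DDOS-LOIC", "FTP-Bruteforce", "DoS-GoldenEye", "DoS-Slow HTTP Test",
    "SSH-Bruteforce",
]

# Values copied from Table 4 (assumed: rows = true class, columns = predicted class).
cm = [
    [339502, 1, 1, 4, 124, 190, 22, 12, 0, 0],
    [6, 11, 0, 0, 2, 0, 0, 0, 0, 0],
    [0, 0, 13546, 0, 0, 0, 0, 0, 0, 0],
    [7, 0, 4, 13052, 0, 0, 0, 0, 0, 0],
    [23, 0, 0, 0, 47992, 0, 0, 0, 0, 0],
    [96, 0, 0, 0, 0, 20032, 0, 0, 0, 0],
    [13, 0, 0, 0, 0, 0, 2919, 0, 0, 0],
    [22, 0, 0, 0, 0, 0, 2, 758, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 116, 0],
    [24, 9, 0, 0, 3, 0, 0, 0, 0, 0],
]

total = sum(sum(row) for row in cm)
correct = sum(cm[k][k] for k in range(len(cm)))  # diagonal = correctly classified
accuracy = correct / total

# Per-class recall: correctly classified samples divided by all samples of that class.
recall = {labels[k]: cm[k][k] / sum(cm[k]) for k in range(len(cm))}
```

Under this reading the overall accuracy is about 0.9987, while the recall for SSH-Bruteforce is 0 because none of its 36 samples lies on the diagonal. This shows how a high overall accuracy can coexist with weak detection of rare classes in an imbalanced data set.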


4.3. Comparative analysis 

This part compares our LSTM model with other models, as shown in Table 5. The model proposed in
[13] detects and classifies bot attacks that pose a threat to banking transactions and uses the
CSE-CIC-IDS 2018 data set. Seth et al. [7] suggested identifying different types of attacks by ranking
the detection ability of the classifiers and building an ensemble. Rios et al. [33] suggested the use of
a broad learning system (BLS), which achieves good performance in less training time, to detect
denial-of-service attacks in telecommunication networks.


Table 5. Comparison between the proposed LSTM model and other methods

Research             DL and ML method    Data set            Accuracy
Lin et al. [13]      LSTM + AM           CSE-CIC-IDS 2018    96.2%
Seth et al. [7]      Light GBM + HBGB    CSE-CIC-IDS 2018    97.5%
Rios et al. [33]     CFBLS and BLS       CSE-CIC-IDS 2018    97.46%
The proposed LSTM    LSTM                CSE-CIC-IDS 2018    99%


5. CONCLUSION 

In this paper, a system for detecting network intrusions is proposed using deep learning. The LSTM
method was used to build a neural network that was applied to the real CSE-CIC-IDS2018 data set to
detect intrusions during data flow. The detection accuracy of the model reached 99%, which is good, but
problems and challenges remain: the imbalance in the CSE-CIC-IDS2018 data set and its large size may
distort the accuracy computation, and designing the LSTM model is difficult in terms of increasing the
number of nodes and linking the multiple layers. Looking forward, we plan to increase accuracy, reduce
error, and speed up the training process by using methods that identify the most relevant features
supporting the detection method.